Database Design and Query Construction for Citations in Academic Publications

Abstract

This was the final project delivered for the "Data Science" course ("Computational Management of Data") taught by Silvio Peroni at the University of Bologna, in the MA "DHDK".

The goal of the project is to develop software that processes data stored in different formats, uploads them into two distinct databases, and queries both databases simultaneously according to predefined operations. The software must be accompanied by a document (i.e., a Jupyter notebook) describing the data to process (their main characteristics and possible issues) and how the software is organised (names of the files, where the various Python classes are defined, etc.). The required structure of the database and the queries it had to support were provided by the professor as a UML model, which we had to implement in Python.
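The overall idea (load data from different formats, upload them into two separate stores, then answer one query across both) can be illustrated with a minimal sketch. It is not the project's actual code: the sample records, the `publication` table, and the function names are hypothetical, and both stores are in-memory SQLite databases used as stand-ins, whereas the project paired a relational database with a graph database.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data in two different source formats.
CSV_DATA = "id,title,year\ndoi:10.1/a,First paper,2019\ndoi:10.1/b,Second paper,2021\n"
JSON_DATA = '[{"id": "doi:10.1/c", "title": "Third paper", "year": 2021}]'

# Assumed schema; the real structure was dictated by the UML model.
SCHEMA = "CREATE TABLE publication (id TEXT PRIMARY KEY, title TEXT, year INTEGER)"


def new_database():
    """Create an in-memory SQLite database with the assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def upload(conn, rows):
    """Insert an iterable of dict-like rows into the publication table."""
    conn.executemany(
        "INSERT OR REPLACE INTO publication (id, title, year) "
        "VALUES (:id, :title, :year)",
        rows,
    )
    conn.commit()


def publications_in_year(connections, year):
    """Run the same query on every database and merge results by id."""
    results = {}
    for conn in connections:
        cursor = conn.execute(
            "SELECT id, title, year FROM publication WHERE year = ?", (year,)
        )
        for pub_id, title, pub_year in cursor:
            results[pub_id] = (pub_id, title, pub_year)
    return sorted(results.values())


# One database is filled from CSV, the other from JSON.
db_a = new_database()
db_b = new_database()
upload(db_a, list(csv.DictReader(io.StringIO(CSV_DATA))))
upload(db_b, json.loads(JSON_DATA))

found = publications_in_year([db_a, db_b], 2021)
for row in found:
    print(row)
```

Keying the merged results on the publication identifier means a record present in both databases is returned only once, which is one simple way to keep a combined answer free of duplicates.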

The documentation and our final code can be found in the project repository on GitHub, linked below.

Related Hyperlinks

Key Learnings

Object Oriented Programming
Python
SQLite Database Design
Graph Database Design
Query Writing
Data Analysis
Software Design
Pandas